High-dimensional semi-supervised learning: in search for optimal inference of the mean
We provide a high-dimensional semi-supervised inference framework focused on
the mean and variance of the response. Our data comprise an extensive
set of unlabeled observations of the covariate vectors and a much smaller set of
labeled observations where we observe both the response and the
covariates. We allow the dimension of the covariates to be much larger than the
sample size and impose weak conditions on the statistical form of the data. We
provide new estimators of the mean and variance of the response that extend
some of the recent results for low-dimensional models. In particular,
in some settings we do not require consistent estimation of the functional form
of the data. Together with estimates of the population mean and variance, we
provide their asymptotic distributions and confidence intervals, and we
show gains in efficiency compared to the sample mean and variance. With
minor modifications, our procedure is then extended to inference about
average treatment effects. We also
investigate the robustness of estimation and coverage and demonstrate the broad
applicability and generality of the proposed method.
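
To make the idea of using unlabeled covariates concrete, here is a minimal sketch of a regression-adjusted mean estimator in the low-dimensional, single-covariate case. It is not the high-dimensional procedure described in the abstract; it only illustrates the kind of low-dimensional result the abstract says it extends. The function name `semi_supervised_mean` and the use of ordinary least squares are assumptions made for illustration.

```python
from statistics import fmean


def semi_supervised_mean(x_labeled, y_labeled, x_unlabeled):
    """Regression-adjusted estimate of E[Y] with one covariate.

    Illustrative low-dimensional sketch (an assumption, not the
    abstract's high-dimensional estimator). Requires that the labeled
    covariate values are not all identical.
    """
    x_bar = fmean(x_labeled)
    y_bar = fmean(y_labeled)

    # Ordinary least-squares slope, fitted on the labeled pairs only.
    sxx = sum((x - x_bar) ** 2 for x in x_labeled)
    sxy = sum((x - x_bar) * (y - y_bar)
              for x, y in zip(x_labeled, y_labeled))
    slope = sxy / sxx

    # Covariate mean over all observations, labeled and unlabeled.
    x_bar_all = fmean(list(x_labeled) + list(x_unlabeled))

    # Shift the labeled sample mean by the fitted effect of the gap
    # between the labeled and the overall covariate means.
    return y_bar - slope * (x_bar - x_bar_all)
```

When the labeled covariates happen to sit below the overall covariate mean and the response increases with the covariate, the adjustment moves the estimate upward relative to the plain sample mean of the labeled responses.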
Low-Complexity Modeling for Visual Data: Representations and Algorithms
With the increasing availability and diversity of visual data generated in research labs and everyday life, it is becoming critical to develop disciplined and practical computational tools for such data. This thesis focuses on low-complexity representations and algorithms for visual data, in light of recent theoretical and algorithmic developments in high-dimensional data analysis.
We first consider the problem of modeling a given dataset as superpositions of basic motifs. This model arises in several important applications, including microscopy image analysis, neural spike sorting, and image deblurring. This motif-finding problem can be phrased as "short-and-sparse" blind deconvolution, in which the goal is to recover a short convolution kernel from its convolution with a sparse and random spike train. We normalize the convolution kernel to have unit Frobenius norm and then cast the blind deconvolution problem as a nonconvex optimization problem over the kernel sphere. We demonstrate that (i) in a certain region of the sphere, every local optimum is close to some shift truncation of the ground truth, when the activation spike train is sufficiently sparse and long, and (ii) there exist efficient algorithms that recover some shift truncation of the ground truth under the same conditions. In addition, the geometric characterization of the local solutions and the proposed algorithm extend naturally to more complicated sparse blind deconvolution problems, including image deblurring and convolutional dictionary learning.
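
The short-and-sparse observation model described above can be sketched as follows. This only generates data of the stated form (a short unit-norm kernel convolved with a sparse random spike train); it does not implement the thesis's recovery algorithm. The circular boundary handling, the Bernoulli-Gaussian spike model, and all function names are assumptions made for illustration.

```python
import math
import random


def unit_norm(kernel):
    """Scale the kernel so that its Euclidean (Frobenius) norm is 1."""
    norm = math.sqrt(sum(a * a for a in kernel))
    return [a / norm for a in kernel]


def sparse_spike_train(length, rate, rng):
    """Bernoulli-Gaussian spikes (assumed model): each entry is nonzero
    with probability `rate`, with a standard normal amplitude."""
    return [rng.gauss(0.0, 1.0) if rng.random() < rate else 0.0
            for _ in range(length)]


def circular_convolve(kernel, spikes):
    """Convolve a short kernel with a long spike train, wrapping around
    at the boundary (circular convolution is an assumption here)."""
    n = len(spikes)
    out = [0.0] * n
    for i, s in enumerate(spikes):
        if s == 0.0:
            continue  # only the sparse nonzero spikes contribute
        for j, a in enumerate(kernel):
            out[(i + j) % n] += a * s
    return out


rng = random.Random(0)                      # fixed seed for repeatability
kernel = unit_norm([1.0, -2.0, 1.0])        # short motif on the unit sphere
spikes = sparse_spike_train(200, 0.05, rng)
observation = circular_convolve(kernel, spikes)
```

Blind deconvolution then asks for `kernel` (up to shift and scale) given only `observation`.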
We next consider the problem of modeling physical nuisances across a collection of images, in the context of illumination-invariant object detection and recognition. Illumination variation remains a central challenge in object detection and recognition. Existing analyses of illumination variation typically pertain to convex, Lambertian objects, and guarantee quality of approximation in an average-case sense. We show that it is possible to build vertex-description convex cone models with worst-case performance guarantees for nonconvex Lambertian objects. Namely, a natural detection test based on the angle to the constructed cone is guaranteed to accept any image that is sufficiently well approximated by an image of the object under some admissible lighting condition, and is guaranteed to reject any image that does not have a sufficiently good approximation. The cone models are generated by sampling point illuminations with sufficient density, which follows from a new perturbation bound for point images in the Lambertian model. As the number of point images required for guaranteed detection may be large, we introduce a new formulation for cone-preserving dimensionality reduction, which leverages tools from sparse and low-rank decomposition to reduce the complexity while controlling the approximation error with respect to the original cone. Preliminary numerical experiments suggest that this approach can significantly reduce the complexity of the resulting model.
Toward Guaranteed Illumination Models for Non-Convex Objects
Illumination variation remains a central challenge in object detection and
recognition. Existing analyses of illumination variation typically pertain to
convex, Lambertian objects, and guarantee quality of approximation in an
average case sense. We show that it is possible to build V(vertex)-description
convex cone models with worst-case performance guarantees, for non-convex
Lambertian objects. Namely, a natural verification test based on the angle to
the constructed cone guarantees to accept any image which is sufficiently
well-approximated by an image of the object under some admissible lighting
condition, and guarantees to reject any image that does not have a sufficiently
good approximation. The cone models are generated by sampling point
illuminations with sufficient density, which follows from a new perturbation
bound for point images in the Lambertian model. As the number of point images
required for guaranteed verification may be large, we introduce a new
formulation for cone-preserving dimensionality reduction, which leverages tools
from sparse and low-rank decomposition to reduce the complexity, while
controlling the approximation error with respect to the original cone.
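
The angle-based verification test can be illustrated with a small sketch. Given generator vectors of a cone in vertex description (standing in for sampled point-illumination images), it approximately projects a test vector onto the cone and thresholds the resulting angle. The projected-gradient solver, the iteration count, the convention of returning a right angle when the projection is zero, and the names `project_onto_cone`, `angle_to_cone`, and `accepts` are all assumptions; the abstract does not specify how the angle is computed or how the threshold is chosen.

```python
import math


def project_onto_cone(generators, y, iterations=2000):
    """Approximate projection of y onto {sum_k c_k g_k : c_k >= 0}
    using projected gradient descent on the coefficients (a sketch).
    Assumes at least one generator is nonzero."""
    m, d = len(generators), len(y)
    # The squared Frobenius norm upper-bounds the gradient's Lipschitz
    # constant, so 1 / bound is a safe step size.
    step = 1.0 / sum(v * v for g in generators for v in g)
    coeffs = [0.0] * m
    for _ in range(iterations):
        point = [sum(coeffs[k] * generators[k][i] for k in range(m))
                 for i in range(d)]
        residual = [point[i] - y[i] for i in range(d)]
        grad = [sum(generators[k][i] * residual[i] for i in range(d))
                for k in range(m)]
        # Gradient step followed by clipping to nonnegative coefficients.
        coeffs = [max(0.0, coeffs[k] - step * grad[k]) for k in range(m)]
    return [sum(coeffs[k] * generators[k][i] for k in range(m))
            for i in range(d)]


def angle_to_cone(generators, y):
    """Angle in radians between y and its approximate projection."""
    p = project_onto_cone(generators, y)
    norm_p = math.sqrt(sum(v * v for v in p))
    norm_y = math.sqrt(sum(v * v for v in y))
    if norm_p == 0.0:
        return math.pi / 2  # convention: y is no closer than a right angle
    cosine = sum(a * b for a, b in zip(p, y)) / (norm_p * norm_y)
    return math.acos(max(-1.0, min(1.0, cosine)))


def accepts(generators, y, max_angle):
    """Accept y when its angle to the cone is at most max_angle
    (the threshold is a hypothetical tuning parameter)."""
    return angle_to_cone(generators, y) <= max_angle
```

For example, with generators `[[1.0, 0.0], [0.0, 1.0]]` the cone is the nonnegative quadrant, so a vector such as `[1.0, 1.0]` lies inside it and is accepted for any nonnegative threshold.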